1. INTRODUCTION


At the close of the Large Area Crop Inventory Experiment (LACIE) Phase III
crop year, several investigations were outlined in support of the
Classification and Mensuration Subsystem (CAMS). The goal of the secondary
error analysis plan was to use Procedure 1 to evaluate as many of the error
sources as possible in the small-grains estimate for 5- by 6-nautical-mile
segments in the U.S. Great Plains. The component of the plan considered
herein is the evaluation of analyst labeling errors on type 1 and type 2
dots. The purpose of this study is to determine the effects of this
mislabeling on classification and proportion estimation performance.
Mislabeling errors are the only secondary errors that introduce both bias
and increased variance into the proportion estimates and are thus viewed
as the most important source of error.

2. GENERAL DESCRIPTION OF PROCEDURES 

This experiment was conducted by relabeling the Procedure 1 dot grids 
(ref. 1, pp. 4-8 through 4-10) with ground-truth information and reprocessing 
the segments. Two types of dot grids were used in the test: the random
dot grid used in Phase III and the uniform dot grid that is being used
in the Transition Year. Each of these two proportion estimates was then 
compared to the latest acceptable CAMS estimate for the Phase III crop year. 

Relabeling of the Procedure 1 dot grids required an accurate ground-truth
map on a pixel-by-pixel basis. To produce the ground-truth map for each
segment, the county agent annotated an aerial photograph of the entire
segment. The field boundaries were digitized in the photo-cartographic
laboratory, and both of these products were then used to produce a tape
of all ground-truth information. This tape was, in turn, used in a
computer program (ref. 2) to generate a gray-scale map with the crop code
represented for each pixel.

The ground-truth maps were registered to the latest acquisition date
processed by CAMS for which an acceptable proportion estimate was produced.
These acquisitions are referred to as the base acquisition dates for the
ground-truth maps. For the reprocessing of a test segment, the same base
acquisition date was used in the generation of the ground-truth map, in
the Phase III processing of the segment, and in the reprocessing of the
segment using ground-truth labeled random and uniform dot grids. Agreement
of these base acquisition dates ensured a minimal amount of misregistration
between ground-truth maps and images.

The multiple processing of a test segment required that exactly the same 
multitemporal acquisition dates be used. The same combination of random 
grid dots was labeled using ground truth as had been labeled in the
Phase III processing. Because the processing was kept as nearly the same 
as possible, the only variable being observed was the dot labeling error. 
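
The relabeling step described in this section amounts to looking up each
grid dot in the per-pixel ground-truth map. The following is a minimal
sketch in Python under stated assumptions: the map is held as a list of
rows of crop codes, and the crop codes, map size, grid spacing, dot count,
and function names are all hypothetical illustrations, not values or
routines taken from this report or from the program in ref. 2.

```python
import random

# Hypothetical crop codes; the report does not list its actual codes.
WHEAT, OTHER = "W", "O"

def uniform_grid(n_rows, n_cols, step):
    """Return (row, col) dots spaced every `step` pixels in both directions."""
    return [(r, c) for r in range(0, n_rows, step)
                   for c in range(0, n_cols, step)]

def random_grid(n_rows, n_cols, n_dots, seed=0):
    """Return `n_dots` distinct (row, col) dots drawn at random."""
    rng = random.Random(seed)
    pixels = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    return rng.sample(pixels, n_dots)

def label_dots(truth_map, dots):
    """Label each dot with the crop code of the pixel it falls on."""
    return {(r, c): truth_map[r][c] for r, c in dots}

# Toy 4 x 6 ground-truth map: left half wheat, right half other.
truth_map = [[WHEAT] * 3 + [OTHER] * 3 for _ in range(4)]
uniform_labels = label_dots(truth_map, uniform_grid(4, 6, step=2))
random_labels = label_dots(truth_map, random_grid(4, 6, n_dots=5))
```

Because both grids are labeled from the same registered map, any difference
between the two ground-truth treatments in such a setup comes from the dot
positions alone.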

3. DESCRIPTION OF THE EXPERIMENT 

The data set chosen for use in this study consisted of Phase III blind sites 
that had been worked using Procedure 1. The data included nine states 
within the U.S. Great Plains, with the number of segments in each state
varying from one to five. Initially, a total of 30 test segments was
chosen; however, because of incomplete ground-truth coverage and problems
in reprocessing, the final total included only 25 segments. These test
data, along with information on acquisitions used in processing, are more
fully described in table 1. Of the final 25 test segments, 5 segments had
additional problems in the computation of a digitized ground-truth estimate
(as described in section 2); therefore, a 400-dot-count estimate of the
ground-truth proportion was used. 

A randomized complete block experimental design was used to analyze the data. 
The data were blocked by state with segments nested within the states. The 
treatments consisted of three classifications differentiated by dot labeling 
techniques: analyst-interpreter (AI) labeling of the random grid system,
ground-truth labeling of the random grid system, and ground-truth labeling
of the uniform grid system. The model for the three analyses of variance
(ANOVA) was as follows:

\[ y_{ijk} = \mu + b_i + s_{ij} + t_k + bt_{ik} + e_{ijk} \]

where

$\mu$ = the overall mean of the observations

$b_i$ = the block or state effect (i = 1, 2, ..., 9)

$s_{ij}$ = the segment effect (j = 1, 2, ..., $n_i$, as $n_i$ varies with state;
i = 1, 2, ..., 9)

$t_k$ = the treatment or labeling effect (k = 1, 2, 3)

$bt_{ik}$ = the block-by-treatment effect

$e_{ijk}$ = random error for each observation

$y_{ijk}$ = response variable

In this experimental design, the assumption was made that there is an 
interaction between treatments and states. This could occur because 
of varying field sizes among states, a fact which would affect the analyst's 
labeling capability. 
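
As a cross-check on this design, the degrees of freedom can be derived from
the counts of states, segments, and treatments. The sketch below is an
illustration only (the function and variable names are hypothetical); it
uses the counts reported in this section, 20 segments in 8 states for the
two PCC analyses and 25 segments in 9 states for the proportion analysis,
and reproduces the degrees of freedom listed in tables 3, 6, and 9.

```python
def anova_degrees_of_freedom(n_states, n_segments, n_treatments):
    """Degrees of freedom for a design with segments nested in states,
    treatments crossed with states, and one observation per
    segment-treatment combination."""
    df = {
        "state": n_states - 1,
        "segment within state": n_segments - n_states,
        "treatment": n_treatments - 1,
        "state by treatment": (n_states - 1) * (n_treatments - 1),
    }
    total = n_segments * n_treatments - 1
    # Whatever is not explained by the four effects is left for error.
    df["error"] = total - sum(df.values())
    df["total"] = total
    return df

pcc_df = anova_degrees_of_freedom(n_states=8, n_segments=20, n_treatments=3)
proportion_df = anova_degrees_of_freedom(n_states=9, n_segments=25, n_treatments=3)
```

Here `pcc_df` gives 7, 12, 2, 14, 24, and 59, and `proportion_df` gives
8, 16, 2, 16, 32, and 74, in the row order used by the ANOVA tables.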

For the first two analyses, $y_{ijk}$ was the true probability of correct
classification (PCC) for segment j within state i using labeling procedure k
as calculated from type 1 starting dots (PCC1) and type 2 bias correction
dots (PCC2), respectively. For these two analyses, only 20 test segments
ranging over 8 states were included, because incomplete ground-truth
information prohibited the calculation of the PCC in the remaining five
segments.


The response variable for the third analysis is defined as follows:

\[ y_{ijk} = p_{ijk} - p_{ij} \]

where $p_{ijk}$ is the proportion of wheat for segment j within state i using
labeling procedure k, and $p_{ij}$ is the ground-truth proportion for segment j
within state i.

A Newman-Keuls multiple comparison test was planned in the event that any
of these ANOVA tests indicated significant treatment differences.

4. RESULTS 

Table 2 presents the results of computing the PCC1 for each of the three
labeling procedures (treatments). Table 3 lists the ANOVA test results
for the PCC1. The states, the segments within the states, and the treatments
were found to be significantly different at the 5-percent level of
significance. The Newman-Keuls test was performed, and the results appear in
table 4. Means are arranged in ascending order, and means that differ at the
5-percent level of significance are separated by brackets. Differences were
found between the AI labeled procedure and each of the two
ground-truth labeled procedures. However, no difference was found between
the two ground-truth labeled grid systems: uniform and random.
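
The grouping reported for the Newman-Keuls test can be illustrated with
the textbook form of the procedure. The sketch below is not necessarily
the exact computation used in this study. It assumes that each treatment
mean rests on 20 segments, that the standard error is the square root of
the error mean square divided by that count, and that the 5-percent
studentized range points for 24 error degrees of freedom (2.92 for two
means, 3.53 for three) are taken from a standard table; these critical
values do not appear in this report.

```python
import math

# Treatment means and error mean square as reported in tables 3 and 4.
means = {"AI": 85.9, "Uniform": 91.0, "Random": 92.3}
error_mean_square = 25.0
n_per_treatment = 20  # assumption: segments contributing to each mean

# Assumption: upper 5-percent studentized range points for 24 error
# degrees of freedom, copied from a standard statistical table.
q_crit = {2: 2.92, 3: 3.53}

def newman_keuls(means, std_error, q_crit):
    """Return {(lower name, higher name): significant?} for every pair."""
    ordered = sorted(means.items(), key=lambda kv: kv[1])
    k = len(ordered)
    verdict = {}  # (low index, high index) -> significant?
    for span in range(k, 1, -1):  # test the widest ranges first
        for lo in range(k - span + 1):
            hi = lo + span - 1
            # A range enclosed by a non-significant range is not significant.
            inside_ns = any(a <= lo and hi <= b and not sig
                            for (a, b), sig in verdict.items())
            diff = ordered[hi][1] - ordered[lo][1]
            verdict[(lo, hi)] = (not inside_ns) and diff > q_crit[span] * std_error
    return {(ordered[a][0], ordered[b][0]): sig for (a, b), sig in verdict.items()}

std_error = math.sqrt(error_mean_square / n_per_treatment)
result = newman_keuls(means, std_error, q_crit)
```

Under these assumptions the AI mean differs from both ground-truth means,
while the uniform and random means do not differ from each other, which
agrees with the grouping described above.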

Table 5 presents the results of computing the PCC2 for each of the three 
treatments, and table 6 gives the corresponding ANOVA test results. The 
states and the segments within the states show differences at the 5-percent 
level of significance. There was no significant difference between
treatments, as indicated by the similarity of the treatment means.

Table 7 presents the proportion estimates for the three treatments and the
ground-truth proportions. Table 8 presents the differences between the
ground-truth proportion and the AI labeled proportion ($\Delta_{AI}$); the
ground-truth proportion and the ground-truth labeling of the random dot grid
proportion ($\Delta_R$); and the ground-truth proportion and the ground-truth
labeling of the uniform dot grid proportion ($\Delta_U$). Table 9 is the ANOVA
table prepared using these differences. No significant difference was found
in any of the factors examined.


5. CONCLUSIONS 

From this examination of 25 test segments using AI labeling and ground- 
truth labeling, the PCC on type 1 dots was found to be significantly better 
for both types of ground-truth labeled procedures than the PCC obtained 
using AI labeling. No significant difference in the PCC was found for 
type 2 dots. However, in all three treatments, the type 2 dots included 
pixels which fell on boundaries or were mixed pixels (ref. 1, pp. 7-1 
and 7-2). This would account for all PCC2 values being equally low. 

The proportion estimates achieved in these classifications showed no 
significant differences between procedures. Table 8 shows a large
variance among the proportion estimates; this may account
for the nonsignificance found in table 9. However, the ranking of the
treatment means places both types of ground-truth labeled procedures above 
the AI labeled procedure. It was expected that ground-truth labeling 
would be a significant improvement to the classifier. Using Procedure 1, 
however, comparable estimates were achieved regardless of the type of 
labeling procedure.

TABLE 1.- TEST DATA

Segment*   Location (county, state)  Acquisitions used         Robertson biostage†
1005 (W)   Cheyenne, Colo.           7177, 7159, 6326, 6254    6.0, 4.8, 2.5, 1.0
1032 (W)   Wichita, Kans.            7194, 7086, 6326, 6254    6.0, 2.7, 2.5, 1.0
1033 (W)   Clark, Kans.              7156, 6288                5.1, 2.2
1853 (W)   Ness, Kans.               7193, 7067, 6253          6.0, 2.0, 1.0
1861 (W)   Kearny, Kans.             7194, 7158, 7104, 6326    6.0, 4.9, 3.0, 2.5
1512 (S)   Clay, Minn.               7193, 7156                5.2, 3.5
1520 (S)   Big Stone, Minn.          7174, 7156, 7120          4.4, 3.6, 2.0
1544 (S)   Sheridan, Mont.           7198, 6294                5.8, 0.0
1739 (M)   Teton, Mont.              7222, 7168, 7132, 6263    6.0/6.0, 4.2/3.8, 3.1/2.5, 1.8/1.0
1582 (W)   Hayes, Nebr.              7194, 7158, 7086, 6254    6.0, 4.6, 2.6, 1.0
1604 (S)   Renville, N. Dak.         7143, 7125                2.7, 1.9
1606 (S)   Ward, N. Dak.             7197, 7125                5.3, 1.9
1648 (S)   Bowman, N. Dak.           7179, 7125                4.5, 2.0
1661 (S)   McIntosh, N. Dak.         7159, 7123                3.5, 1.8
1902 (S)   McKenzie, N. Dak.         7197, 7125                5.2, 2.0
1231 (W)   Jackson, Okla.            7156, 7066, 6238          5.4, 3.0, 2.2
1242 (W)   Canadian, Okla.           7173, 7155, 7101, 6287    6.0, 5.6, 3.4, 2.0
1367 (W)   Major, Okla.              7155, 7101, 6287          5.3, 3.4, 2.0
1677 (S)   Spink, S. Dak.            7211, 7193, 7176, 7140    6.0, 5.7, 4.5, 2.9
1690 (S)   Kingsbury, S. Dak.        7211, 7193, 6325          6.0, 6.0, 0.0
1803 (W)   Shannon, S. Dak.          7178, 7159, 7123, 6255    6.0, 4.7, 3.2, 2.0
1805 (M)   Gregory, S. Dak.          7211, 7158, 6307, 6290    6.0/6.0, 4.7/4.0, 2.5/0.0, 2.4/0.0
1056 (W)   Moore, Tex.               7158, 7049, 6290          5.2, 2.7, 2.4
1059 (W)   Ochiltree, Tex.           7157, 7121, 6325, 6307    5.1, 3.8, 2.5, 2.4
1060 (W)   Sherman, Tex.             7158, 7068                5.2, 2.8

*W = winter wheat; S = spring wheat; M = mixed wheat.
†Robertson scale adjustable crop calendar.

TABLE 2.- PCC1 RESULTS

                              Labeling procedure
Segment - state         Random     Uniform          AI
1005 - Colo.              87.2        86.0 [illegible]
1032 - Kans.              89.7        91.8 [illegible]
1033 - Kans.              88.0        92.0 [illegible]
1853 - Kans.              96.8        90.9        87.1
1056 - Tex.               95.0 [illegible]        89.7
1059 - Tex.               87.8 [illegible]        83.3
1060 - Tex.               97.8        93.7        87.0
1231 - Okla.             100.0 [illegible]        97.2
1242 - Okla.             100.0 [illegible]        96.8
1544 - Mont.              91.2        71.7        76.5
1582 - Nebr.              97.8        97.9        97.8
1604 - N. Dak.            90.2        81.0        61.5
1606 - N. Dak.            82.6        80.6        75.6
1648 - N. Dak.            83.7        92.9        71.4
1661 - N. Dak.            87.5        91.7        75.0
1902 - N. Dak.            88.0       100.0        96.0
1677 - S. Dak.            90.5        83.3        87.5
1690 - S. Dak.           100.0        91.7        94.9
1803 - S. Dak.           100.0 [illegible]        98.0
1805 - S. Dak.            92.0 [illegible]        80.0
Total                   1845.8      1819.2      1717.4
Average                   92.3        91.0        85.9

TABLE 3.- PCC1 ANOVA TEST RESULTS

Source of               Degrees of   Sum of    Mean                Level of
variation               freedom      squares   square    F-value   significance, %
State                        7         1600      229       9.2          5
Segment within state        12          927       77       3.1          5
Treatment                    2          459      230       9.2          5
State by treatment          14          438       31       1.2          NS
Error                       24          590       25
Total                       59         4014
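
The entries of this ANOVA table can be checked against one another. The
sketch below is an illustrative consistency check, not part of the
original analysis. It takes the State sum of squares as the total minus
the other rows (an assumption, because that cell is poorly legible in the
source), recomputes each mean square as the sum of squares divided by its
degrees of freedom, and forms each F-value from the rounded mean squares,
which is the convention that reproduces the printed figures.

```python
# (degrees of freedom, sum of squares) for the legible rows of table 3.
total_ss = 4014
rows = {
    "Segment within state": (12, 927),
    "Treatment": (2, 459),
    "State by treatment": (14, 438),
    "Error": (24, 590),
}

# Assumption: the State sum of squares is whatever remains of the total.
state_ss = total_ss - sum(ss for _, ss in rows.values())
rows = {"State": (7, state_ss), **rows}

# Mean square = sum of squares / degrees of freedom, rounded to an integer.
# 459 / 2 = 229.5 is a tie; Python rounds ties to the even integer, 230.
mean_square = {name: round(ss / df) for name, (df, ss) in rows.items()}

# F-value = effect mean square / error mean square, to one decimal place.
f_value = {name: round(ms / mean_square["Error"], 1)
           for name, ms in mean_square.items() if name != "Error"}
```

With these inputs the State sum of squares comes out as 1600, the mean
squares as 229, 77, 230, 31, and 25, and the F-values as 9.2, 3.1, 9.2,
and 1.2.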





TABLE 4.- NEWMAN-KEULS TEST OF PCC1

Treatment       AI       Uniform    Random
Mean          [85.9]     [91.0       92.3]

TABLE 5.- PCC2 RESULTS

                              Labeling procedure
Segment - state         Random     Uniform          AI
1005 - Colo.              71.7        78.3        68.3
1032 - Kans.              76.3        76.7        79.7
1033 - Kans.              89.5        83.3        86.2
1853 - Kans.              85.0        73.3        78.3
1056 - Tex.               65.0 [illegible]        81.7
1059 - Tex.               84.5 [illegible]        77.2
1060 - Tex.               94.9 [illegible]        83.1
1231 - Okla.              89.8        96.6        91.5
1242 - Okla.              86.8        86.8        83.6
1544 - Mont.              55.0        61.7        64.7
1582 - Nebr.              96.7        93.2        96.7
1604 - N. Dak.            76.7        60.0        63.3
1606 - N. Dak.            76.6        62.5        72.3
1648 - N. Dak.            66.7        79.3        61.7
1661 - N. Dak.            81.1        76.0        71.7
1902 - N. Dak.            83.3        86.4        90.0
1677 - S. Dak.            72.5        76.9        84.3
1690 - S. Dak.            88.3        90.0        88.3
1803 - S. Dak.            98.3        98.3        98.3
1805 - S. Dak.            92.9        92.5        89.0
Total                   1631.6      1618.5      1609.9
Average                   81.6        80.9        80.5
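
The column totals and averages in this table can be re-derived from the
segment values. The sketch below is an illustrative check limited to the
Random and AI columns, because three Uniform entries are not legible in
the source. It takes the first value listed for each segment as Random
and the last as AI; the helper name is hypothetical.

```python
# PCC2 values in the row order of table 5 (20 segments).
random_pcc2 = [71.7, 76.3, 89.5, 85.0, 65.0, 84.5, 94.9, 89.8, 86.8, 55.0,
               96.7, 76.7, 76.6, 66.7, 81.1, 83.3, 72.5, 88.3, 98.3, 92.9]
ai_pcc2 = [68.3, 79.7, 86.2, 78.3, 81.7, 77.2, 83.1, 91.5, 83.6, 64.7,
           96.7, 63.3, 72.3, 61.7, 71.7, 90.0, 84.3, 88.3, 98.3, 89.0]

def total_and_average(values):
    """Return the column total and average, each rounded to one decimal."""
    total = round(sum(values), 1)
    return total, round(total / len(values), 1)

random_total, random_average = total_and_average(random_pcc2)
ai_total, ai_average = total_and_average(ai_pcc2)
```

Both columns reproduce the printed totals (1631.6 and 1609.9) and averages
(81.6 and 80.5), which supports the column assignment used in the check.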

TABLE 6.- PCC2 ANOVA TEST RESULTS

Source of               Degrees of   Sum of    Mean                  Level of
variation               freedom      squares   square    F-value     significance, %
State                        7         4068      581   [illegible]        5
Segment within state        12         1873      156   [illegible]        5
Treatment                    2           12        6       0.2            NS
State by treatment          14          253       18       0.5            NS
Error                       24          811       34
Total                       59         7017

Treatment    Mean
AI           80.5
Uniform      80.9
Random       81.6

TABLE 7.- PROPORTION ESTIMATES 



*Indicates that the 400-dot-count estimate was used.

TABLE 9.- PROPORTION ESTIMATE ANOVA

Source of               Degrees of   Sum of    Mean                Level of
variation               freedom      squares   square    F-value   significance, %
State                        8          234       29      0.88          NS
Segment within state        16          508       32      0.97          NS
Treatment                    2           96       48      1.45          NS
State by treatment          16          490       31      0.94          NS
Error                       32         1067       33
Total                       74         2395

Treatment    Mean
Uniform       5.0
Random        6.8
AI            7.8










